explicit content
Highguard, a hyperpop arena shooter, and other new indie games worth checking out
Welcome to our latest roundup of what's going on in the indie game space. There are tons of interesting games out this week. But first, there's been some discourse around the Nintendo Switch version of one of those games, which arrived this week as well. On other platforms, there's an option to censor genitalia and other explicit content, but that option isn't present in the Switch version. Instead, such content is censored by default, with black rectangles covering up characters' bits and someone flipping the bird.
- Information Technology > Artificial Intelligence > Games > Computer Games (0.71)
- Information Technology > Communications > Mobile (0.49)
Language models for longitudinal analysis of abusive content in Billboard Music Charts
Chandra, Rohitash, Suresh, Yathin, Sinha, Divyansh Raj, Jindal, Sanchit
There is no doubt that there has been a drastic increase in abusive and sexually explicit content in music, particularly on the Billboard Music Charts. However, there is a lack of studies that validate the trend for effective policy development, even though such content can drive harmful behavioural changes in children and youths. In this study, we utilise deep learning methods to analyse songs (lyrics) from the Billboard Charts of the United States over the last seven decades. We provide a longitudinal study using deep learning and language models, reviewing the evolution of content with sentiment analysis and abuse detection, including detection of sexually explicit content. Our results show a significant rise in explicit content in popular music from 1990 onwards. Furthermore, we find an increasing prevalence of songs with lyrics containing profane, sexually explicit, and otherwise inappropriate language. The longitudinal analysis demonstrates the ability of language models to capture nuanced patterns in lyrical content, reflecting shifts in societal norms and language use over time.
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > Massachusetts (0.04)
- North America > United States > Indiana (0.04)
- (5 more...)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
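To make the set-up concrete, here is a minimal sketch of the kind of longitudinal pass the abstract above describes. This is our reconstruction, not the authors' code, and the classifier model id is a hypothetical placeholder for any lyrics explicitness classifier.

```python
# Sketch: score each charting song's lyrics, then aggregate the explicit
# share per decade. Not the paper's code; the model id below is hypothetical.
from collections import defaultdict
from transformers import pipeline

# Hypothetical model id: substitute any text classifier that labels
# profane or sexually explicit text.
clf = pipeline("text-classification", model="some-org/explicit-lyrics-classifier")

def explicit_share_by_decade(songs):
    """songs: iterable of (year, lyrics) pairs from the chart data."""
    counts = defaultdict(lambda: [0, 0])  # decade -> [explicit, total]
    for year, lyrics in songs:
        decade = (year // 10) * 10
        label = clf(lyrics[:2000], truncation=True)[0]["label"]
        counts[decade][0] += label == "EXPLICIT"  # label name depends on the model
        counts[decade][1] += 1
    return {d: ex / n for d, (ex, n) in sorted(counts.items())}
```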
LLMs for Translation: Historical, Low-Resourced Languages and Contemporary AI Models
Large Language Models (LLMs) have demonstrated remarkable adaptability in performing various tasks, including machine translation (MT), without explicit training. Models such as OpenAI's GPT-4 and Google's Gemini are frequently evaluated on translation benchmarks and utilized as translation tools due to their high performance. This paper examines Gemini's performance in translating an 18th-century Ottoman Turkish manuscript, Prisoner of the Infidels: The Memoirs of Osman Agha of Timisoara, into English. The manuscript recounts the experiences of Osman Agha, an Ottoman subject who spent 11 years as a prisoner of war in Austria, and includes his accounts of warfare and violence. Our analysis reveals that Gemini's safety mechanisms flagged between 14 and 23 percent of the manuscript as harmful, resulting in untranslated passages. These safety settings, while effective in mitigating potential harm, hinder the model's ability to provide complete and accurate translations of historical texts. Through real historical examples, this study highlights the inherent challenges and limitations of current LLM safety implementations in the handling of sensitive and context-rich materials. These real-world instances underscore potential failures of LLMs in contemporary translation scenarios, where accurate and comprehensive translations are crucial: for example, translating the accounts of modern victims of war for legal proceedings or humanitarian documentation.
- Europe > Romania > Vest Development Region > Timiș County > Timișoara (0.24)
- Europe > Austria (0.24)
- Asia > Middle East > Republic of Türkiye (0.14)
- (6 more...)
- Law (0.88)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
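For a sense of how a refusal share like the 14 to 23 percent above can be measured, here is a minimal sketch under stated assumptions: it uses the google-generativeai Python client, a model id chosen for illustration, and our reading of how safety blocks surface in responses; exact field names may differ by SDK version.

```python
# Sketch: count the fraction of manuscript passages Gemini declines to
# translate because a safety filter fired. Our reconstruction, not the
# paper's code; adjust field names to your SDK version.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")            # placeholder key
model = genai.GenerativeModel("gemini-1.5-pro")    # model id is an assumption

def flagged_fraction(passages):
    flagged = 0
    for text in passages:
        resp = model.generate_content(
            "Translate this Ottoman Turkish passage into English:\n\n" + text
        )
        # A block typically surfaces as no candidates (prompt blocked) or a
        # candidate whose finish_reason is SAFETY (response blocked).
        if not resp.candidates or resp.candidates[0].finish_reason.name == "SAFETY":
            flagged += 1
    return flagged / len(passages)
```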
Effective Black-Box Multi-Faceted Attacks Breach Vision Large Language Model Guardrails
Yang, Yijun, Wang, Lichao, Yang, Xiao, Hong, Lanqing, Zhu, Jun
Vision Large Language Models (VLLMs) integrate visual data processing, expanding their real-world applications, but also increasing the risk of generating unsafe responses. In response, leading companies have implemented multi-layered safety defenses, including alignment training, safety system prompts, and content moderation. However, their effectiveness against sophisticated adversarial attacks remains largely unexplored. In this paper, we propose MultiFaceted Attack, a novel attack framework designed to systematically bypass multi-layered defenses in VLLMs. It comprises three complementary attack facets: a Visual Attack that exploits the multimodal nature of VLLMs to inject toxic system prompts through images; an Alignment Breaking Attack that manipulates the model's alignment mechanism to prioritize the generation of contrasting responses; and an Adversarial Signature that deceives content moderators by strategically placing misleading information at the end of the response. Extensive evaluations on eight commercial VLLMs in a black-box setting demonstrate that MultiFaceted Attack achieves a 61.56% attack success rate, surpassing state-of-the-art methods by at least 42.18%.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > Mexico > Mexico City > Mexico City (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- (4 more...)
- Research Report > New Finding (0.67)
- Research Report > Promising Solution (0.48)
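We won't reproduce any attack payloads here, but the black-box evaluation protocol is simple to picture. Below is a hedged sketch of how an attack success rate (ASR) like the 61.56% above is tallied; `query_model` and `is_unsafe` are hypothetical stand-ins for a VLLM API call and a response judge, not anything from the paper.

```python
# Sketch of the ASR bookkeeping only (no attack content). `query_model` and
# `is_unsafe` are hypothetical stand-ins, not part of the paper's code.
def attack_success_rate(model_names, attack_cases, query_model, is_unsafe):
    successes = total = 0
    for name in model_names:                     # e.g. eight commercial VLLMs
        for prompt, image in attack_cases:       # adversarial text-image pairs
            response = query_model(name, prompt, image)  # black-box: API access only
            successes += bool(is_unsafe(response))       # judged by a moderator model
            total += 1
    return successes / total                     # 0.6156 corresponds to 61.56%
```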
Sensitive Content Classification in Social Media: A Holistic Resource and Evaluation
Antypas, Dimosthenis, Sen, Indira, Perez-Almendros, Carla, Camacho-Collados, Jose, Barbieri, Francesco
The detection of sensitive content in large datasets is crucial for ensuring that shared and analysed data is free from harmful material. However, current moderation tools, such as external APIs, suffer from limitations in customisation, accuracy across diverse sensitive categories, and privacy concerns. Additionally, existing datasets and open-source models focus predominantly on toxic language, leaving gaps in detecting other sensitive categories such as substance abuse or self-harm. In this paper, we put forward a unified dataset tailored for social media content moderation across six sensitive categories: conflictual language, profanity, sexually explicit material, drug-related content, self-harm, and spam. By collecting and annotating data with consistent retrieval strategies and guidelines, we address the shortcomings of previous, narrowly focused research. Our analysis demonstrates that fine-tuning large language models (LLMs) on this novel dataset yields significant improvements in detection performance compared to open off-the-shelf models such as LLaMA, and even proprietary OpenAI models, which underperform by 10-15% overall. The limitation is even more pronounced for popular moderation APIs, which cannot be easily tailored to specific sensitive content categories.
- Europe > United Kingdom (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Europe > Central Europe (0.04)
- Asia > Taiwan (0.04)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.35)
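As a rough illustration of the fine-tuning set-up the abstract describes, here is a minimal sketch. The paper fine-tunes LLMs such as LLaMA, whereas this sketch uses a small encoder for brevity, and treating the six categories as multi-label is our assumption.

```python
# Sketch: multi-label fine-tuning over the six sensitive categories. Our
# reconstruction for illustration, not the released code or models.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

LABELS = ["conflictual", "profanity", "sexually_explicit",
          "drugs", "self_harm", "spam"]

tok = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base",
    num_labels=len(LABELS),
    problem_type="multi_label_classification",  # a post can hit several categories
)

def encode(texts, category_sets):
    """Tokenize posts and build multi-hot label vectors for training."""
    enc = tok(texts, truncation=True, padding=True, return_tensors="pt")
    enc["labels"] = torch.tensor(
        [[float(lbl in cats) for lbl in LABELS] for cats in category_sets]
    )
    return enc  # feed to a standard Trainer or training loop
```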
Unveiling Concept Attribution in Diffusion Models
Nguyen, Quang H., Phan, Hoang, Doan, Khoa D.
Diffusion models have shown remarkable abilities in generating realistic and high-quality images from text prompts. However, a trained model remains a black box; little do we know about the role of its components in exhibiting a concept such as an object or a style. Recent works employ causal tracing to localize layers storing knowledge in generative models, without showing how those layers contribute to the target concept. In this work, we approach the model interpretability problem from a more general perspective and pose a question: "How do model components work jointly to demonstrate knowledge?" We adapt component attribution to decompose diffusion models, unveiling how each component contributes to a concept. Our framework allows effective model editing; in particular, we can erase a concept from diffusion models by removing positive components while retaining knowledge of other concepts. Surprisingly, we also show there exist components that contribute negatively to a concept, which the knowledge localization approach has not discovered. Experimental results confirm the role of positive and negative components pinpointed by our framework, depicting a complete view of interpreting generative models. Our code is available at https://github.com/mail-research/CAD-attribution4diffusion
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > New York (0.04)
- North America > Mexico > Mexico City > Mexico City (0.04)
- Asia > Vietnam (0.04)
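The core idea, as we read it, can be pictured with a simple ablation loop: remove one component at a time and watch how a concept score moves. The paper's actual attribution method is more refined, so treat this as a toy approximation; `concept_score` is a hypothetical callable.

```python
# Toy ablation sketch of component attribution (our simplification, not the
# paper's method itself). `concept_score` is a hypothetical callable that
# runs the model on a prompt and returns a scalar concept strength.
import torch

def attribute_components(model, components, concept_score, prompt):
    """components: dict mapping name -> parameter tensor of `model`."""
    base = concept_score(model, prompt)
    scores = {}
    for name, param in components.items():
        saved = param.detach().clone()
        with torch.no_grad():
            param.zero_()                                 # ablate the component
            scores[name] = base - concept_score(model, prompt)
            param.copy_(saved)                            # restore weights
    # score > 0: the component contributes positively to the concept
    # (removing it weakens the concept); score < 0: it contributes negatively.
    return scores
```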
Safe Text-to-Image Generation: Simply Sanitize the Prompt Embedding
Qiu, Huming, Chen, Guanxu, Zhang, Mi, Yang, Min
In recent years, text-to-image (T2I) generation models have made significant progress in generating high-quality images that align with text descriptions. However, these models also face the risk of unsafe generation, potentially producing harmful content that violates usage policies, such as explicit material. Existing safe generation methods typically focus on suppressing inappropriate content by erasing undesired concepts from visual representations, while neglecting to sanitize the textual representation. Although these methods help mitigate the risk of misuse to a certain extent, their robustness remains insufficient when dealing with adversarial attacks. Given that semantic consistency between input text and output image is a fundamental requirement for T2I models, we identify that textual representations (i.e., prompt embeddings) are likely the primary source of unsafe generation. To this end, we propose a vision-agnostic safe generation framework, Embedding Sanitizer (ES), which focuses on erasing inappropriate concepts from prompt embeddings and uses the sanitized embeddings to guide the model towards safe generation. ES is applied to the output of the text encoder as a plug-and-play module, enabling seamless integration with different T2I models as well as other safeguards. In addition, ES's unique scoring mechanism assigns a score to each token in the prompt to indicate its potential harmfulness, and dynamically adjusts the sanitization intensity to balance defensive performance and generation quality. In extensive evaluation on five prompt benchmarks against nine baseline methods, our approach achieves state-of-the-art robustness by sanitizing the source of unsafe generation (the prompt embedding). It significantly outperforms existing safeguards in terms of interpretability and controllability while maintaining generation quality.
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Addiction Disorder (0.34)
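To make "plug-and-play on the text encoder output" concrete, here is a minimal sketch of a sanitizer module in the spirit of ES. The scorer architecture and the interpolation toward a learned safe embedding are our assumptions for illustration, not the published design.

```python
# Sketch of an ES-style module: score each token embedding for harmfulness
# and pull risky tokens toward a learned "safe" embedding. Our illustration,
# not the released Embedding Sanitizer.
import torch
import torch.nn as nn

class EmbeddingSanitizer(nn.Module):
    def __init__(self, dim, intensity=1.0):
        super().__init__()
        self.scorer = nn.Sequential(nn.Linear(dim, dim // 2), nn.ReLU(),
                                    nn.Linear(dim // 2, 1), nn.Sigmoid())
        self.safe_embedding = nn.Parameter(torch.zeros(dim))
        self.intensity = intensity  # trades defensive strength for image quality

    def forward(self, prompt_emb):            # prompt_emb: (batch, tokens, dim)
        harm = self.scorer(prompt_emb)        # per-token harmfulness in [0, 1]
        w = (harm * self.intensity).clamp(0.0, 1.0)
        return (1 - w) * prompt_emb + w * self.safe_embedding

# Usage: cond = sanitizer(text_encoder(prompt_ids)), before the embeddings
# condition the T2I model.
```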
Can AI image generators be policed to prevent explicit deepfakes of children?
Child abusers are creating AI-generated "deepfakes" of their targets in order to blackmail them into filming their own abuse, beginning a cycle of sextortion that can last for years. Creating simulated child abuse imagery is illegal in the UK, and Labour and the Conservatives have aligned on the desire to ban all explicit AI-generated images of real people. But there is little global agreement on how the technology should be policed. Worse, no matter how strongly governments take action, the creation of more images will always be a press of a button away: explicit imagery is built into the foundations of AI image generation. In December, researchers at Stanford University made a disturbing discovery: buried among the billions of images making up one of the largest training sets for AI image generators were hundreds, maybe thousands, of instances of child sexual abuse material (CSAM).
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Law (1.00)
- Health & Medicine > Therapeutic Area > Pediatrics/Neonatology (0.56)
A Framework for Portrait Stylization with Skin-Tone Awareness and Nudity Identification
Kim, Seungkwon, Kim, Sangyeon, Nam, Seung-Hun
The Webtoon phenomenon has evolved beyond traditional paper-based comics. It uses information technology to present content that is both produced and consumed in a digital format, and it is rapidly gaining global popularity. Webtoon is thus well positioned as an optimal environment for integration with generative AI. In this regard, portrait stylization has been an active research area, in which given individual photographs are translated into specific art styles to enhance the value of intellectual property (IP) by delivering a distinct sense of enjoyment to users [1].

Net and a fine-tuned SD model exhibits acceptable performance, as shown in the upper part of Figure 1. Despite the breadth of existing studies, designing a portrait stylization framework at the business level remains challenging, as shown in the bottom part of Figure 1. First, concerns exist over skin-tone expression, in which a model uniformly alters users' actual skin tones to match those of a specific trained style, possibly leading to ethical issues. Second, malicious users could generate sexual content with a specific style. In IP-based businesses, safeguarding the IP is crucial; unfortunately, the neglect of this issue in existing studies
Extraction and Summarization of Explicit Video Content using Multi-Modal Deep Learning
Joshi, Shaunak, Gaggar, Raghav
With the increase in video-sharing platforms across the internet, it is difficult for humans to moderate all of the data for explicit content. Hence, an automated pipeline to scan through video data for explicit content has become the need of the hour. We propose a novel pipeline that uses multi-modal deep learning to first extract the explicit segments of input videos and then summarize their content in text to determine age appropriateness and an age rating. Finally, we evaluate the pipeline's effectiveness using standard metrics.
- North America > United States > California (0.14)
- North America > Canada (0.04)
- Leisure & Entertainment (0.69)
- Media > Film (0.47)
- Information Technology (0.46)
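The two-stage pipeline is easy to picture. Below is a rough sketch of stage one, where per-frame explicitness predictions are merged into time segments; `frame_is_explicit` is a hypothetical stand-in for the paper's multi-modal classifier, and the second stage would caption the returned segments for the age-rating decision.

```python
# Sketch of stage one: merge per-frame explicitness flags into time segments.
# `frame_is_explicit` is a hypothetical stand-in for the paper's classifier.
def extract_explicit_segments(frames, frame_is_explicit, fps=30, min_gap_frames=60):
    """Return (start_sec, end_sec) spans of explicit content, merging spans
    separated by fewer than `min_gap_frames` frames (2 s at 30 fps)."""
    segments = []
    for i, frame in enumerate(frames):
        if not frame_is_explicit(frame):
            continue
        if segments and i - segments[-1][1] <= min_gap_frames:
            segments[-1][1] = i            # extend the current segment
        else:
            segments.append([i, i])        # open a new segment
    return [(start / fps, end / fps) for start, end in segments]
```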